Three hundred and twenty-four middle school students considered a group of three 
graphs in a newspaper article about boating deaths. The graphs contained 
discrepancies and the students were asked to “comment on unusual features.” This 
form of questioning produced a distribution of responses surprising to the authors 
and perhaps challenging to current goals for statistical literacy. Of these students, 
201 answered the same question two years later and although overall performance 
improved to some extent there were still very few high level responses. The outcomes 
point to specific suggestions that can be made for middle school classrooms in line 
with the goals of statistical literacy. 

INTRODUCTION 

Quantitative literacy and critical numeracy have emerged as avenues for considering 
mathematics in a reform curriculum aimed at catering for all students (Steen, 2001); 
in the same way statistical literacy is taking the chance and data curriculum to a wider 
audience. Adults need to interpret the information with which they are inundated 
daily; but what are the criteria for effective decision-making? International adult 
literacy surveys (e.g., Dossey, 1997) have considered document literacy and 
quantitative literacy alongside prose literacy as significant tools required by adults in 
western society. The tasks employed in these surveys have a strong reliance on 
statistical ideas, particularly graph interpretation. 

Gal (2002, pp. 2-3) suggested that statistical literacy considers people’s ability to 
interpret and critically evaluate statistical information, and their ability to 
communicate their understanding, concerns, and reactions. Watson (1997) proposed a 
three-tiered hierarchy for statistical literacy, incorporating (i) an understanding of 
basic statistical terminology and tools, (ii) an understanding of these terms and tools 
within societal contexts, and (iii) the ability to question claims made without 
appropriate justification. These steps are similar to the code-breaking, text-meaning 
and usage, and critical thinking components associated with models of critical 
literacy (e.g., Freebody & Luke, 2003). 

The current study arose from a larger project that was focused on school students’ 
appreciation of variation as the foundation of the chance and data curriculum (see, 
e.g., Watson & Kelly, 2002, 2003). Although the aim of the larger study was to 
describe the development of the understanding of statistical variation, tasks were 
developed in contexts that employed the specific topics in the curriculum, such as 
chance events, averages, and graphs, and thus considered aspects of statistical literacy 
as well. The contexts varied, including simple settings such as rolling a die, familiar 
settings such as a school survey, and unfamiliar social settings such as would be 
found outside of school. It is for this last setting and the topic of graphs that the task 
discussed here was devised. 

The graphs upon which the task was based are shown in Figure 1 (Haley, 2000) and 
were chosen for their potential for measuring aspects of statistical literacy in a 
context involving authentic variation. The straightforward bar graph style was 
considered accessible to all students at the middle school level. Having three graphs 
instead of one allowed features to be compared and contrasted. The context for the 
graphs — boating deaths in the state where the students lived — was considered 
comprehensible, as well as a social issue worth considering in terms of the goals of 
statistical literacy. There were two anomalies in the graphs that would allow students 
to question and display critical statistical literacy skills. There was also variation in 
the graphs, which was an underlying feature of the larger study. 



[Figure: three bar graphs from a newspaper story headed “BOATIES' SAFETY FAILURE”]

These graphs were part of a newspaper story reporting on boating deaths in Tasmania. 
Figure 1: Set of three graphs used in the task (Haley, 2000) 

Of interest was what students would attend to when examining the graphs. What 
aspects of the graphs would they find “unusual”? Would they be influenced by the 
authentic nature of a newspaper extract and be unwilling to question it? Specifically, 
this study examines the categories of response that characterise middle school 
students’ descriptions of unusual features of bar graphs from the media (containing 
technical discrepancies). It also considers whether the responses change over a two- 
year period. 

METHODOLOGY 

Sample. The sample consisted of 156 students in Grade 7 (age 12-13) and 168 
students in Grade 9 (age 14-15) at four government high schools in the Australian 
state of Tasmania. Of these students, 137 in Grade 7 and 64 in Grade 9 responded 
again two years later. Fewer students were surveyed in Grade 11 because many leave 
or change schools at the end of Grade 10. 

Procedure. The task was Question 10 in a 45-minute written survey with 15 
questions, many with multiple parts (see Watson, Kelly, Callingham, & Shaughnessy, 
2003, for the full survey). It was the only question based on a bar graph or a graph 
from a newspaper. The instruction, “Comment on any unusual features of the 
graphs,” was intended to motivate students to consider various aspects without being 
so explicit as to influence students’ focus. Two large labelled spaces were provided to 
encourage careful consideration and reflect the plural use of the word “feature,” thus 
permitting two responses. The task behaved well in a measurement sense (Watson et 
al., 2003) based on the two different hierarchical codings described below. 

Analysis. For the purposes of coding, the two responses were treated together. Coding 
was conducted by a research assistant and the first author following the development 
of two coding schemes. The first scheme, in Table 1, reflected the appropriateness of 
responses based on the information in the graph and the steps to Critical Statistical 
Literacy (CSL) noted earlier. The four coding levels of increasing appropriateness 
had various subcategories defined to reflect the diversity of responses. The second 
coding scheme, shown in Table 2, was based on the increasing structural complexity 
of responses in Appreciation of Variation (VAR). Four levels of response were 
defined, with one having three subgroupings. As indicated by their definitions these 
two coding schemes were used to reflect the different possible interpretations of the 
task based on the twin aims of investigating variation and statistical literacy. 



Code  Sub Code  Description of Category for Critical Statistical Literacy (CSL)
0               Inappropriate responses
      0A        No response
      0B        Idiosyncratic/“nothing unusual”
      0C        Inferring from graph: Advice
      0D        Direct graph interpretation, without mentioning anything unusual
      0E        Incorrect graph interpretation of unusual data
1               Partially correct interpretation: Unusual data or graphing
      1A        Very general comments about graphing elements
      1B        Both correct and incorrect interpretations of unusual data
2               Correct graph interpretation: Unusual data or graphing
      2A        Correct but non-specific interpretation of unusual data
      2B        Specific statistical comment about graphing elements
3               In-depth graph analysis: Recognises mistakes
      3A        Identification of a mistake, but error in explanation
      3B        Correct identification of at least one mistake

Table 1: Coding scheme based on Critical Statistical Literacy criteria


Code  Sub Code  Description of Category for Appreciation of Variation (VAR)
0               No acknowledgement of variation
1               Focus on columns
      1A        Focus on a single column
      1B        Comparison across two columns
      1C        Focus on the highest column as “most”
2               Focus on increase in the data over time
3               Acknowledgement of variation

Table 2: Coding scheme based on criteria related to Appreciation of Variation


All responses had two codes associated with them, one for CSL and one for VAR. As 
an example, the response “That 35 people died from not wearing lifejackets, 8 from 
alcohol” was coded as 2A in the CSL coding scheme for its non-specific 
interpretation of the data, and as 1A in the VAR coding scheme for its focus on single 
columns. The research assistant coded the responses, which were checked by the first 
author, with inconsistencies decided by discussion (cf. Miles & Huberman, 1994). 
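
The dual coding can be pictured as a small data structure. The following is a minimal sketch only, not the procedure used in the study: the class name `CodedResponse` and its fields are hypothetical, while the code labels are transcribed from Tables 1 and 2 and the example is the response quoted above.

```python
from dataclasses import dataclass

# Code labels transcribed from Table 1 (CSL) and Table 2 (VAR).
CSL_CODES = {"0A", "0B", "0C", "0D", "0E", "1A", "1B", "2A", "2B", "3A", "3B"}
VAR_CODES = {"0", "1A", "1B", "1C", "2", "3"}


@dataclass(frozen=True)
class CodedResponse:
    """Hypothetical container: one student response with its two codes."""
    text: str
    csl: str  # Critical Statistical Literacy code, e.g. "2A"
    var: str  # Appreciation of Variation code, e.g. "1A"

    def __post_init__(self):
        # Reject labels that do not appear in either coding scheme.
        if self.csl not in CSL_CODES:
            raise ValueError(f"unknown CSL code: {self.csl}")
        if self.var not in VAR_CODES:
            raise ValueError(f"unknown VAR code: {self.var}")

    @property
    def csl_level(self) -> int:
        # The leading digit of a code is its level (0 to 3).
        return int(self.csl[0])

    @property
    def var_level(self) -> int:
        return int(self.var[0])


example = CodedResponse(
    text="That 35 people died from not wearing lifejackets, 8 from alcohol",
    csl="2A",
    var="1A",
)
print(example.csl_level, example.var_level)
```

Keeping the subcode and deriving the level from its first digit means the level-based summaries reported later can be recomputed without storing the level separately.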

RESULTS 

The results report students’ response categories distinguished by the coding schemes 
for Critical Statistical Literacy (CSL) and Appreciation of Variation (VAR). Changes 
in response levels over two years are also reported for some students. Students’ full 
responses have been edited in some cases to show only the salient features. 

Responses for the CSL coding scheme 

As shown in Table 3, a large percentage of students in both Grades 7 (44%) and 9 (40%) 
did not respond at all to the task (Category 0A). Typical responses in Category 0B 
indicated nothing unusual or were idiosyncratic, such as “They all look okay to me.” 
Some students focused on giving advice based on the information in the graphs 
(Category 0C), rather than something unusual; for example, “People should wear life 
jackets.” Category 0D responses commented on something in the graph but without 
focusing on anything unusual, such as “The graphs show us that boats are just as 
dangerous as cars are.” Finally, Category 0E contained responses that identified as 
unusual something that would not be considered unusual in a statistical sense or that 
was not based on information in the graph, as seen in “Hardly anyone wore life 
jackets in 99” or “Less people died by not wearing lifejackets.” 



Code              0                 1          2          3      Total
Sub Code    0A  0B  0C  0D  0E    1A  1B    2A  2B    3A  3B
Grade 7     68   7   5   3   5     2   3    53   6     2   2
Subtotals        88 (56)          5 (3)    59 (38)    4 (3)      156
Grade 9     67   9   0   4   0     2   8    61  13     2   2
Subtotals        80 (48)         10 (6)    74 (44)    4 (2)      168

Table 3: CSL categories: number (percent) for each grade 
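
The subtotals and percentages in Table 3 follow directly from the subcategory counts. As a rough check, and only as a sketch, the arithmetic can be reproduced as below; the dictionary `TABLE3` and the function `level_summary` are names introduced here for illustration, and the counts are those reported in the table.

```python
# Subcategory counts transcribed from Table 3.
TABLE3 = {
    "Grade 7": {"0A": 68, "0B": 7, "0C": 5, "0D": 3, "0E": 5,
                "1A": 2, "1B": 3, "2A": 53, "2B": 6, "3A": 2, "3B": 2},
    "Grade 9": {"0A": 67, "0B": 9, "0C": 0, "0D": 4, "0E": 0,
                "1A": 2, "1B": 8, "2A": 61, "2B": 13, "3A": 2, "3B": 2},
}


def level_summary(counts):
    """Return (total, {level: (count, rounded percent)}) for one grade."""
    total = sum(counts.values())
    by_level = {}
    for code, n in counts.items():
        level = int(code[0])  # leading digit of the subcode is the level
        by_level[level] = by_level.get(level, 0) + n
    summary = {lvl: (n, round(100 * n / total))
               for lvl, n in sorted(by_level.items())}
    return total, summary


for grade, counts in TABLE3.items():
    total, summary = level_summary(counts)
    no_response = round(100 * counts["0A"] / total)
    print(grade, total, summary, f"no response: {no_response}%")
```

Run on these counts, the subtotals agree with the table, and the non-response rates come out at 44% for Grade 7 and 40% for Grade 9, matching the figures quoted in the text.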

At Level 1, responses were partially correct and addressed unusual data or the format 
of the graph. Category 1A responses made very general or vague statements, such as 
“They’re all different graphs. They’re [sic] all got different meanings.” Category 1B 
responses included both correct and incorrect interpretations. One student wrote 
“Most people drowned in 1999. A lot of people were tanked [drunk].” 

Responses at Level 2 reflected what students considered unusual features of the data 
or graphs but which were not related to the errors therein. In category 2A were 
non-specific comments about the unusual nature of the data. An example is “The 
number of deaths has risen over the years.” The other subcategory of Level 2 (2B) 
was much smaller, consisting of at least one comment on something unusual about 
the graphs themselves; for instance, “The way they’re set out. They don’t have 
anything telling you what the Y and X axes are.” 

Of the Level 3 responses that found mistakes, the first group (3A) made errors in 
reporting these, e.g., “Well on graph 1 it says there is a total of 46 but I counted and it 
has only got 38.” In the final group (3B), responses focused correctly on the 
discrepancies in the graphs, such as “The first graph has a mistake, the 6 is on 2.” 

Responses for the VAR coding scheme 

Level 0 responses for the VAR coding, summarized in Table 4, included both 
non-responses and responses that had no comments that would indicate a student had 
considered change or variation in the graphs. Many of the latter, such as “The first 
one says total: 46, but the graph shows 50 people” may have been placed at much 
higher levels in the CSL coding. 



Code      0        1A       1B      1C       2      3      Total
Grade 7   89 (57)  40 (26)  3 (2)   14 (9)   4 (3)  6 (4)  156
Grade 9   97 (58)  26 (15)  13 (8)  22 (13)  5 (3)  5 (3)  168

Table 4: Variation categories: numbers (percent) for each grade 

At Level 1 there were three subcategories focusing on columns. In the first (1A), 
attention was given to a single column. One student wrote “35 people weren’t 
wearing a life jacket. Tonnes of people keeled over [died] in ’99.” Category 1B 
responses considered two columns: for example, “From ’87 the deaths have shot up 
from 6 to 12”. In the third group (1C), responses focused on the highest column as 
“most”, as exemplified in “There were more deaths in ’99 than any others.” 

Level 2 responses recognised increases in the data over time, and included “The 
number of deaths has risen over the years.” Level 3 responses made a comment 
relating to the variation, such as “Most of the people who die were spread out but 
there was a major increase in ’99.” 

Change in response categories over two years 

Table 5 shows the categories of responses related to CSL for the subgroup who 
completed the survey two years later. There was some improvement, with fewer 
students responding at Level 0. The Grade 7/9 students performed better than the 
original Grade 9 cohort, due to an increased number of Level 1 responses. Both 
groups showed a small increase in the number of Level 3 responses, and the Grade 
9/11 students also increased their number of Level 2 responses. 

For students with complete data across both years, 43% of Grade 7/9 and 56% of 
Grade 9/11 responded at the same level (0 to 3), whereas 39% of Grade 7/9 and 26% 
of Grade 9/11 improved. One Grade 9 student who had originally written “Too many 
recreational deaths. Not many alcohol related deaths” (2A) gave the following 
Category 0C response two years later: “People wearing a life jacket, in sheltered 
waters in a boat under 6m should always have a life jacket.” A Grade 7 student who 
had originally identified a discrepancy in the graph (3B) later focused only on 
specific aspects of the data, saying “a lot of people die in 1999” (2B). One Grade 7 
student who initially focused on specific data elements in writing “Not many people 
had life jackets. Tons of people drunk” (1B) identified the error in the graph totals in 
his longitudinal response (3B). A Grade 9 student who gave no response at all in the 
initial survey later engaged in the task with a Category 1B response: “99 results are 
extremely high all of a sudden. Alcohol was the cause for almost half.” 
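
The comparison of levels across the two surveys can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis: the function `classify_change` is a name introduced here, the four code pairs are the individual cases described in the paragraph above, and the implied share of students who declined is simply the remainder after the reported "same" and "improved" percentages, so rounding may shift it by a point.

```python
def level(code: str) -> int:
    """Level (0 to 3) is the leading digit of a CSL code such as '2A'."""
    return int(code[0])


def classify_change(first: str, second: str) -> str:
    """Compare the level of the initial code with the level two years later."""
    before, after = level(first), level(second)
    if after > before:
        return "improved"
    if after < before:
        return "declined"
    return "same"


# (initial code, code two years later) for the four students described above.
EXAMPLES = [("2A", "0C"), ("3B", "2B"), ("1B", "3B"), ("0A", "1B")]
changes = [classify_change(first, second) for first, second in EXAMPLES]
print(changes)

# Reported CSL percentages: (same level, improved) for each cohort.
REPORTED = {"Grade 7/9": (43, 39), "Grade 9/11": (56, 26)}
implied_declined = {cohort: 100 - same - improved
                    for cohort, (same, improved) in REPORTED.items()}
print(implied_declined)
```

On this reading, the first two students declined and the last two improved, and roughly 18% of each cohort responded at a lower CSL level in the second survey.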



Code              0                 1          2          3      Total
Sub Code    0A  0B  0C  0D  0E    1A  1B    2A  2B    3A  3B
Grade 7/9   27   1   2   5   6     0  29    50   5     7   5
Subtotals        41 (30)         29 (21)   55 (40)   12 (9)      137
Grade 9/11   7   5   1   3   2     0   7    26   7     3   2
Subtotals        18 (28)          7 (11)   33 (52)    6 (10)      64

Table 5: CSL categories, longitudinal survey (cf. Table 3) 

Table 6 reports on similar data but for the VAR coding. Again there was some 
improvement over the two-year period, and once more the Grade 7/9 students 
performed better than their earlier Grade 9 counterparts. Most of the change was due 
to increased numbers in Category 1C and Level 2. Responses clearly articulating 
variation across the graphs (Level 3) were still rare; however, some of them indicated 
a significant change for some individuals. One Grade 7 student’s response in the first 
survey — “I can’t see anything in them. I don’t know what the 87 to 99 means” — was 
coded low on both CSL and VAR, but the student’s response in the second survey 
recognized that the range is big (Level 3). Another student initially could identify 
particular data elements, and then two years later had a more holistic view, writing “I 
think that the graph ‘Recreational boating deaths’ was fairly inconsistent throughout 
the years and had a sudden jump at the end (99)” (Level 3). 



Code        0        1A       1B       1C       2        3      Total
Grade 7/9   47 (34)  30 (22)  15 (11)  20 (15)  17 (12)  8 (6)  137
Grade 9/11  23 (36)  10 (16)  2 (3)    14 (22)  11 (17)  4 (6)  64

Table 6: VAR categories, longitudinal survey (cf. Table 4) 

For students with complete data across both years, 43% of Grade 7/9 and 44% of 
Grade 9/11 responded at the same level (0 to 3), whereas 45% of Grade 7/9 and 32% 
of Grade 9/11 improved. There were some who declined quite markedly from the 
first survey to the second. This includes students who recognized variation initially 
but who, in the second survey, focused on single columns (e.g., “In 99 they went up 
but they should have been down because of the new technology and laws”) or 
claimed not to see any “unusual features” at all. Considering both CSL and VAR, no 
student changed from Level 3 on one to Level 3 on the other. Overall, 14% 
performed at Level 3 on at least one (one student did so on both). 

DISCUSSION 

The importance of variation and critical statistical literacy 

If the task had been more specific, targeting a particular aspect of either variation or 
statistical literacy, larger numbers of students may have given higher level 
responses. It is important to recognize, however, that in articles like that used as a 
basis for the task and in much of the data encountered daily in the real world, the data 
come with no questions at all. Accompanying reports often present the writer’s 
interpretation, which may be biased or incorrect. Moreover, as seen here, the actual 
data as presented come with no guarantees of correctness. Given this, the results of 
this study are of concern, since students appear to lack strategies for searching for the 
“unusual”. They rarely query the data or examine the data in a holistic way. For the 
students in this study, the kind of critical thinking suggested by Gal (2002) and 
explicated as the third step of Watson’s (1997) hierarchy seems to be uncommon. 

For the CSL coding, 40% to 50% of the students could make generally meaningful 
comments about what the graphs were showing or, to a lesser extent, could identify 
technical shortcomings in the graphical presentation. In contrast, a similar number 
either made no comment at all, or could attempt only vague descriptions. As seen in 
Tables 3 and 5, less than 10% of the students appeared to check the data in any way 
for consistency. Presumably the remaining students took the data at face value. 

Performance in relation to the VAR coding was similarly disappointing, with well 
over half of the students in the first survey not considering the variation in the graphs 
as something that might be regarded as unusual. If variation was acknowledged at all, 
in most cases it was because students identified particular values, notably extreme 
values. Students rarely identified — or, at least, commented on — trends or variation in 
data. There seems to be an inability (or unwillingness) to step back from individual 
data points in the graphs to make meaning on a larger scale. Those who commented 
on variation generally gave enough discussion in their response to warrant a Level 2 
classification on the CSL coding, but those few who identified errors (at Level 3 on 
the CSL scale) usually were at only Level 1 on the VAR coding. 

Implications for teachers 

The results suggest that there has been a lack of attention to variation and statistical 
literacy with respect to graphs in the media. It is suspected that students are given 
opportunities to construct graphs based on data, to comment on the technical 
presentation of existing graphs, and to read off values from graphs and tables, but that 
critical evaluation and higher level analysis are rarely explicitly fostered. Examples 
such as the boating deaths graphs used here are not rare in the media, but students 
need activities that help them to focus on whether the data are internally consistent, 
whether there are unusual values, whether there are any trends in the data, and how 
the data vary. Most importantly, they need to make meaning from the data. Teachers 
can model the kinds of questions that students could and should ask when examining 
data. A discussion might proceed along the following teacher-directed sequence, 
depending on students’ intermediate responses. “What story do the graphs tell? ... Is 
there anything about them that you consider unusual? . . . Are there any mistakes in 
the graphs? . . . How might you tell the story better?” 
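
The checks listed above (internal consistency, unusual values, overall spread) can also be expressed as a short routine that a class might work through by hand or by computer. This is a minimal sketch under loud assumptions: the function `check_graph` is a name introduced here, and the bar values and stated total are entirely hypothetical, chosen for illustration and not taken from the newspaper graphs.

```python
def check_graph(values, stated_total):
    """Answer three simple questions about a bar graph's data.

    values: mapping of bar label to bar height
    stated_total: the total printed alongside the graph
    """
    actual_total = sum(values.values())
    return {
        # Is the printed total consistent with the bars?
        "total_matches": actual_total == stated_total,
        "actual_total": actual_total,
        # Which bar stands out as the largest?
        "largest_bar": max(values, key=values.get),
        # How much do the bars vary overall?
        "range": max(values.values()) - min(values.values()),
    }


# HYPOTHETICAL data for illustration only (not the values in Figure 1).
hypothetical_bars = {"Year A": 3, "Year B": 5, "Year C": 4, "Year D": 6, "Year E": 12}
report = check_graph(hypothetical_bars, stated_total=32)
print(report)
```

With these made-up numbers the bars sum to 30 rather than the stated 32, the final bar is the largest, and the range is 9, which mirrors the kinds of observations sought in the task.
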

Limitations and directions for future research 

It should be noted that some students may have lacked motivation to respond 
seriously to the task, as indicated by some terse or coarsely expressed responses. The 
authors are also aware that some students may not have been able to express 
themselves clearly or in detail when completing a written survey, especially given its 
length. The use of an interview setting is likely to provide richer data. It would also 
be interesting to use the task with adults, because their experience of real world data 
since leaving school may affect the kinds of things they perceive as unusual. Finally, 
can explicit instruction in looking for the unusual in data help students make this their 
usual approach to examining statistical material? 